Inference for determinantal point processes without spectral knowledge
Determinantal point processes (DPPs) are point process models that naturally
encode diversity between the points of a given realization, through a positive
definite kernel K. DPPs possess desirable properties, such as exact sampling
or analyticity of the moments, but learning the parameters of kernel K
through likelihood-based inference is not straightforward. First, the kernel
that appears in the likelihood is not K, but another kernel L related to K
through an often intractable spectral decomposition. This issue is
typically bypassed in machine learning by directly parametrizing the kernel
L, at the price of some interpretability of the model parameters. We follow
this approach here. Second, the likelihood has an intractable normalizing
constant, which takes the form of a large determinant in the case of a DPP over
a finite set of objects, and the form of a Fredholm determinant in the case of
a DPP over a continuous domain. Our main contribution is to derive bounds on
the likelihood of a DPP, both for finite and continuous domains. Unlike
previous work, our bounds are cheap to evaluate since they do not rely on
approximating the spectrum of a large matrix or an operator. Through usual
arguments, these bounds thus yield cheap variational inference and moderately
expensive exact Markov chain Monte Carlo inference methods for DPPs.
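
To make the finite-domain normalizing constant concrete: an L-ensemble DPP over
N items assigns P(A) = det(L_A) / det(L + I), so exact likelihood evaluation
requires the O(N^3) determinant of the full kernel, which is precisely the cost
the paper's bounds are designed to avoid. A minimal NumPy sketch of this exact
computation follows; the RBF kernel and the observed subset are hypothetical
illustrations, not taken from the paper.

import numpy as np

def dpp_log_likelihood(L, subset):
    # Exact log-likelihood of an observed subset A under a finite DPP
    # parametrized by an L-ensemble kernel L (N x N, positive definite):
    #   P(A) = det(L_A) / det(L + I)
    # The det(L + I) normalizer is the expensive term the bounds sidestep.
    N = L.shape[0]
    L_A = L[np.ix_(subset, subset)]
    _, logdet_A = np.linalg.slogdet(L_A)           # slogdet avoids overflow
    _, logdet_Z = np.linalg.slogdet(L + np.eye(N))
    return logdet_A - logdet_Z

# Hypothetical example: an RBF L-kernel over 8 points on a line.
x = np.linspace(0.0, 1.0, 8)
L = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.1)
print(dpp_log_likelihood(L, [0, 3, 7]))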
Optimal Preconditioning and Fisher Adaptive Langevin Sampling
We define an optimal preconditioning for the Langevin diffusion by
analytically optimizing the expected squared jumped distance. The optimal
preconditioner turns out to be the inverse Fisher information matrix, where
the Fisher matrix is computed as the expected outer product of log-target
gradients under the target. We apply this result to the Metropolis
adjusted Langevin algorithm (MALA) and derive a computationally efficient
adaptive MCMC scheme that learns the preconditioning from the history of
gradients produced as the algorithm runs. We show in several experiments that
the proposed algorithm is very robust in high dimensions and significantly
outperforms other methods, including a closely related adaptive MALA scheme
that learns the preconditioning with standard adaptive MCMC as well as the
position-dependent Riemannian manifold MALA sampler.
Comment: 21 pages, 15 figures
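
As one concrete reading of the scheme, the sketch below runs preconditioned
MALA while estimating the Fisher matrix as a running average of outer products
of log-target gradients collected along the chain, and uses its inverse as the
proposal preconditioner. All names, the step size, and the simple running
average are illustrative choices, not the paper's exact update rule.

import numpy as np

def fisher_adaptive_mala(log_p, grad_log_p, x0, n_iters=5000, step=0.5, reg=1e-3):
    # Sketch of preconditioned MALA with a Fisher-based preconditioner
    # F ~ E[grad log p(x) grad log p(x)^T] learned from the gradient history.
    # A rigorous adaptive sampler would also freeze or diminish the
    # adaptation over time to preserve ergodicity.
    d = len(x0)
    x = np.array(x0, dtype=float)
    g = grad_log_p(x)
    F = np.eye(d)                                  # running Fisher estimate
    samples = []
    for t in range(1, n_iters + 1):
        F += (np.outer(g, g) - F) / t              # average of gradient outer products
        A = np.linalg.inv(F + reg * np.eye(d))     # preconditioner ~ inverse Fisher
        C = step * A
        C = 0.5 * (C + C.T)                        # symmetrize for numerical safety
        mean_fwd = x + 0.5 * C @ g
        prop = np.random.multivariate_normal(mean_fwd, C)
        gp = grad_log_p(prop)
        mean_bwd = prop + 0.5 * C @ gp
        def log_q(y, m):                           # log N(y; m, C); normalizer cancels
            r = y - m
            return -0.5 * r @ np.linalg.solve(C, r)
        log_alpha = (log_p(prop) - log_p(x)
                     + log_q(x, mean_bwd) - log_q(prop, mean_fwd))
        if np.log(np.random.rand()) < log_alpha:   # Metropolis-Hastings correction
            x, g = prop, gp
        samples.append(x.copy())
    return np.array(samples)

# Hypothetical anisotropic Gaussian target to exercise the sampler:
prec = np.linalg.inv(np.diag([1.0, 100.0]))
draws = fisher_adaptive_mala(lambda x: -0.5 * x @ prec @ x,
                             lambda x: -prec @ x, x0=np.ones(2))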
Manifold Relevance Determination
In this paper we present a fully Bayesian latent variable model which
exploits conditional nonlinear(in)-dependence structures to learn an efficient
latent representation. The latent space is factorized to represent shared and
private information from multiple views of the data. In contrast to previous
approaches, we introduce a relaxation to the discrete segmentation and allow
for a "softly" shared latent space. Further, Bayesian techniques allow us to
automatically estimate the dimensionality of the latent spaces. The model is
capable of capturing structure underlying extremely high dimensional spaces.
This is illustrated by modelling unprocessed images with tens of thousands of
pixels. This also allows us to directly generate novel images from the trained
model by sampling from the discovered latent spaces. We also demonstrate the
model by predicting human pose in an ambiguous setting. Our Bayesian
framework allows us to perform disambiguation in a principled manner by
including latent space priors which incorporate the dynamic nature of the data.
Comment: ICML2012
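
To make the "softly" shared factorization concrete: each view places its own
ARD relevance weights over a single common latent space, so a dimension with
high relevance in every view acts as shared while one relevant to a single
view acts as private. The sketch below uses hypothetical, hand-set weights
and a display-only threshold; in the actual model these weights are learned,
and the segmentation remains continuous rather than hard.

import numpy as np

def ard_kernel(X, weights, var=1.0):
    # Squared-exponential kernel with per-dimension ARD weights:
    #   k(x, x') = var * exp(-0.5 * sum_q w_q (x_q - x'_q)^2)
    # A weight w_q near zero switches latent dimension q off for that view.
    diff = X[:, None, :] - X[None, :, :]
    return var * np.exp(-0.5 * np.einsum('ijq,q->ij', diff ** 2, weights))

X = np.random.randn(6, 5)                         # hypothetical latent points (N=6, Q=5)
w_view1 = np.array([2.1, 1.8, 0.9, 0.01, 0.0])    # view 1 mostly uses dims 0-2
w_view2 = np.array([1.9, 0.02, 0.0, 1.5, 1.2])    # view 2 mostly uses dims 0, 3-4
K1 = ard_kernel(X, w_view1)                       # GP covariance for view 1
K2 = ard_kernel(X, w_view2)                       # GP covariance for view 2

# Weights are continuous; a hard threshold is applied here only to display
# which dimensions end up (softly) shared versus private.
shared = (w_view1 > 0.1) & (w_view2 > 0.1)
print("softly shared latent dimensions:", np.where(shared)[0])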